Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scheduler: new hotspot scheduler basic part #1870

Merged
merged 14 commits into from
Nov 4, 2019
Merged

Conversation

Luffbee
Copy link
Contributor

@Luffbee Luffbee commented Oct 30, 2019

What problem does this PR solve?

Currently, the hotspot scheduler use the number of hot regions to make schedule decision, which cannot cover "many warm region" case, and may make wrong schedule decisions.

What is changed and how it works?

Use store flow byte rate to make decision.

Check List

Tests

  • Unit test

Related changes

  • Need to be included in the release notes

@Luffbee Luffbee added type/enhancement The issue or PR belongs to an enhancement. component/schedule Scheduling logic. labels Oct 30, 2019
@Luffbee
Copy link
Contributor Author

Luffbee commented Oct 30, 2019

/rebuild

Copy link
Contributor

@nolouch nolouch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reset LGTM.

server/statistics/store.go Outdated Show resolved Hide resolved
@Luffbee
Copy link
Contributor Author

Luffbee commented Oct 31, 2019

/rebuild

@nolouch
Copy link
Contributor

nolouch commented Nov 1, 2019

PTAL @rleungx

server/schedulers/hot_test.go Outdated Show resolved Hide resolved
server/statistics/store.go Outdated Show resolved Hide resolved
Copy link
Member

@rleungx rleungx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest LGTM.

// the flow of the region exceeds the threshold, the scheduler will update the region in
// the hot cache and the hotdegree of the region will increase.
for storeID, items := range storeHotPeers {
storeHots, ok := stats[storeID]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

storeHots sounds not good enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about hotRegions? Or hotPeers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both are better than storeHots and I prefer to unify the naming of HotPeerStat and storeHotRegionsDefaultLen

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unified similar names.

@codecov-io
Copy link

codecov-io commented Nov 1, 2019

Codecov Report

Merging #1870 into master will decrease coverage by 0.17%.
The diff coverage is 76.8%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1870      +/-   ##
==========================================
- Coverage   78.03%   77.85%   -0.18%     
==========================================
  Files         161      161              
  Lines       16054    16126      +72     
==========================================
+ Hits        12527    12555      +28     
- Misses       2542     2573      +31     
- Partials      985      998      +13
Impacted Files Coverage Δ
server/schedulers/hot_region.go 88.57% <100%> (+2.8%) ⬆️
pkg/mock/mockcluster/mockcluster.go 90.34% <100%> (+0.13%) ⬆️
server/handler.go 53.82% <100%> (-0.5%) ⬇️
server/coordinator.go 83.56% <40%> (+5.94%) ⬆️
server/statistics/store.go 80.82% <52%> (-15.26%) ⬇️
server/api/trend.go 68.75% <66.66%> (ø) ⬆️
server/schedulers/shuffle_hot_region.go 57.6% <87.5%> (-4.19%) ⬇️
server/heartbeat_streams.go 69% <0%> (-6%) ⬇️
server/api/admin.go 65.62% <0%> (-5.81%) ⬇️
pkg/etcdutil/etcdutil.go 82.6% <0%> (-5.8%) ⬇️
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8b2225d...3d42a48. Read the comment docs.

server/statistics/store.go Outdated Show resolved Hide resolved
server/statistics/store.go Outdated Show resolved Hide resolved
server/statistics/store.go Outdated Show resolved Hide resolved
server/statistics/store_hot_peers_infos.go Outdated Show resolved Hide resolved
Copy link
Member

@rleungx rleungx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@lhy1024 lhy1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the rest LGTM

server/statistics/store_hot_peers_infos.go Outdated Show resolved Hide resolved
//| 4 | 6MB |
//| 5 | 0MB |
//| 6 | 0MB |
tc.UpdateStorageWrittenBytes(1, 7.5*MB*statistics.StoreHeartBeatReportInterval)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

statistics.StoreHeartBeatReportInterval is too long.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that, but I didn't find any better way to solve it.

Co-Authored-By: lhy1024 <admin@liudos.us>
@Luffbee Luffbee added the status/can-merge Indicates a PR has been approved by a committer. label Nov 4, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Nov 4, 2019

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Nov 4, 2019

@Luffbee merge failed.

@Luffbee
Copy link
Contributor Author

Luffbee commented Nov 4, 2019

/merge

@sre-bot
Copy link
Contributor

sre-bot commented Nov 4, 2019

/run-all-tests

@sre-bot sre-bot merged commit 058e445 into tikv:master Nov 4, 2019
@Luffbee Luffbee deleted the new-hot-basic branch November 4, 2019 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/schedule Scheduling logic. status/can-merge Indicates a PR has been approved by a committer. type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants